Event-based Failure Prediction
Abstract
Human lives and organizations are increasingly dependent on the correct functioning of computer systems, and their failure can cause personal as well as economic damage. There are two non-exclusive approaches to minimizing the risk of such hazards: (a) fault intolerance tries to eliminate design and manufacturing faults in hardware and software before a system is put into service; (b) fault tolerance techniques deal with faults that occur during service, trying to prevent faults from turning into failures. Since faults, in most cases, cannot be ruled out, we focus on the second approach. Traditionally, fault tolerance has followed a reactive scheme of fault detection, location, and subsequent recovery by redundancy in either space or time. However, in recent years the focus has shifted from these reactive methods towards more proactive schemes that evaluate the current situation of a running system in order to act even before a failure occurs. Once a failure is predicted, it may either be prevented or the outage may be shifted from unplanned to planned downtime, both of which can significantly improve the system's reliability. The first step in this approach, online failure prediction, is the main focus of this thesis. The objective of online failure prediction is to predict the occurrence of failures in the near future based on the current state of the system as observed by runtime monitoring.
This dissertation introduces a new failure prediction method that builds on the evaluation of error events. More specifically, it treats the occurrence of errors as an event-driven temporal sequence and applies a pattern recognition technique in order to predict upcoming failures. Hidden Markov models have successfully solved many pattern recognition tasks. However, standard hidden Markov models are not well suited to processing sequences in continuous time, and existing augmentations do not adequately account for the event-driven character of error sequences. Hence, an extension of hidden Markov models has been developed that employs a semi-Markov process for state traversals, providing the flexibility to model a great variety of temporal characteristics of the underlying stochastic process. The proposed hidden semi-Markov model has been applied to industrial data from a commercial telecommunication platform. The case study showed significantly improved failure prediction capabilities in comparison to well-known existing approaches. It also demonstrated that hidden semi-Markov models perform significantly better than standard hidden Markov models. In order to assess the impact of failure prediction and subsequent actions, a reliability model has been developed that makes it possible to compute steady-state system availability, reliability, and hazard rate. Based on the model, it is shown that such approaches can significantly improve system dependability.
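The idea of treating error events as a sequence and applying pattern recognition can be illustrated with a minimal sketch. It is not the thesis's method: it uses a standard discrete hidden Markov model and ignores the time between events, which is exactly the limitation the hidden semi-Markov extension is meant to address. The two-model comparison, the function names, the threshold, and every probability below are hypothetical values chosen only for illustration.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete sequence under an HMM (scaled forward algorithm)."""
    if not obs:
        return 0.0
    n_states = len(start)
    # alpha[s]: probability of the observations so far, ending in state s.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid numerical underflow on long sequences.
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

def predict_failure(obs, failure_model, normal_model, threshold=0.0):
    """Flag a failure if the failure model explains the sequence better."""
    score = (forward_log_likelihood(obs, *failure_model)
             - forward_log_likelihood(obs, *normal_model))
    return score > threshold

# Hypothetical models: (start, transition, emission) over error types 0, 1, 2.
failure_model = ([0.6, 0.4],
                 [[0.7, 0.3], [0.2, 0.8]],
                 [[0.1, 0.3, 0.6], [0.2, 0.2, 0.6]])
normal_model = ([0.5, 0.5],
                [[0.9, 0.1], [0.1, 0.9]],
                [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]])

if __name__ == "__main__":
    print(predict_failure([2, 2, 1, 2], failure_model, normal_model))
    print(predict_failure([0, 0, 1, 0], failure_model, normal_model))
```

In this sketch, one model is assumed to have been trained on error sequences that preceded failures and the other on sequences that did not; a new sequence is flagged when the log-likelihood difference exceeds the threshold. Moving the threshold trades missed failures against false alarms.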
Publication date: 2008